Abstract



We present a new algorithm for computing $m$-th roots over the finite field $\mathbb{F}_q$, where
$q = p^n$, with $p$ a prime, and $m$ any positive integer. In the particular case $m = 2$,
the cost of the new algorithm is an expected $O(M(n)\log(p) + C(n)\log(n))$ operations
in $\mathbb{F}_p$, where $M(n)$ and $C(n)$ are bounds for the cost of polynomial multiplication and
modular polynomial composition. Known results give $M(n) = O(n\log(n)\log\log(n))$
and $C(n) = O(n^{1.67})$, so our algorithm is subquadratic in $n$.

Keywords. Root extraction; square roots; finite field arithmetic. 

Mathematics Subject Classification 2010. Primary 11Y16, 12Y05, Secondary 
68W30. 

1 Introduction 

Besides its intrinsic interest, computing $m$-th roots over finite fields (for $m$ an integer at least
equal to 2) has found many applications in computer science. Our own interest comes from
elliptic and hyperelliptic curve cryptography; there, square root computations show up in
pairing-based cryptography [3] or point-counting problems [8].

Our result in this paper is a new algorithm for computing $m$-th roots in a degree $n$
extension $\mathbb{F}_q$ of the prime field $\mathbb{F}_p$, with $p$ a prime. Our emphasis is on the case where $p$ is
thought to be small, and the degree $n$ grows. Roughly speaking, we reduce the problem to
$m$-th root extraction in a lower degree extension of $\mathbb{F}_p$ (when $m = 2$, we actually reduce the
problem to square root extraction over $\mathbb{F}_p$ itself).

Our complexity model. It is possible to describe the algorithm in an abstract manner,
independently of the choice of a basis of $\mathbb{F}_q$ over $\mathbb{F}_p$. However, to give concrete complexity
estimates, we have to decide which representation we use, the most usual choices being
monomial and normal bases. We choose to use a monomial basis, since in particular our
implementation is based on the library NTL [20], which uses this representation. Thus, the
finite field $\mathbb{F}_q = \mathbb{F}_{p^n}$ is represented as $\mathbb{F}_p[X]/(f)$, for some monic irreducible polynomial
$f \in \mathbb{F}_p[X]$ of degree $n$; elements of $\mathbb{F}_q$ are represented as polynomials in $\mathbb{F}_p[X]$ of degree less
than $n$. We will briefly mention the normal basis representation later on.

The costs of all algorithms are measured in number of operations $+, \times, \div$ in the base
field $\mathbb{F}_p$ (that is, we are using an algebraic complexity model).

We shall denote upper bounds for the cost of polynomial multiplication and modular
composition by respectively $M(n)$ and $C(n)$. This means that over any field $K$, we can
multiply polynomials of degree $n$ in $K[X]$ in $M(n)$ base field operations, and that we can
compute $f(g) \bmod h$ in $C(n)$ operations in $K$, when $f, g, h$ are degree $n$ polynomials. We
additionally require that both $M$ and $C$ are super-linear functions, as in [23, Chapter 8], and
that $M(n) = O(C(n))$. In particular, since we work in the monomial basis, multiplications
and inversions in $\mathbb{F}_q$ can be done in respectively $O(M(n))$ and $O(M(n)\log(n))$ operations in
$\mathbb{F}_p$, see again [23].

The best known bound for $M(n)$ is $O(n\log(n)\log\log(n))$, achieved by using Fast Fourier
Transform [17]. The most well-known bound for $C(n)$ is $O(n^{(\omega+1)/2})$, due to Brent and
Kung [1], where $\omega$ is such that matrices of size $n$ over any field $K$ can be multiplied in $O(n^{\omega})$
operations in $K$; this estimate assumes that $\omega > 2$, otherwise some logarithmic terms may
appear. Using the algorithm of Coppersmith and Winograd [6], we can take $\omega < 2.37$ and
thus $C(n) = O(n^{1.69})$; an algorithm by Huang and Pan [10] actually achieves a slightly better
exponent of 1.67, by means of rectangular matrix multiplication.

Main result. We will focus in this paper on the case of $t$-th root extraction, where $t$ is a
prime divisor of $q - 1$; the general case of $m$-th root extraction, with $m$ arbitrary, can easily
be reduced to this case (see the discussion after Theorem 1).

The core of our algorithm is a reduction of $t$-th root extraction in $\mathbb{F}_q$ to $t$-th root extraction
in an extension of $\mathbb{F}_p$ of smaller degree. Our algorithm is probabilistic of Las Vegas type, so
its running time is given as an expected number of operations. With this convention, our
main result is the following.

Theorem 1. Let $t$ be a prime factor of $q - 1$, with $q = p^n$, and let $s$ be the order of $p$ in
$\mathbb{Z}/t\mathbb{Z}$. Given $a \in \mathbb{F}_q^*$, one can decide if $a$ is a $t$-th power in $\mathbb{F}_q^*$, and if so compute one of its
$t$-th roots, by means of the following operations:

• an expected $O(sM(n)\log(p) + C(n)\log(n))$ operations in $\mathbb{F}_p$;

• a $t$-th root extraction in $\mathbb{F}_{p^s}$.

Thus, we replace $t$-th root extraction in a degree $n$ extension by a $t$-th root extraction in
an extension of degree $s \le \min(n, t)$. Since $s$ is the order of $p$ modulo $t$, it is the smallest
extension degree for which $t$ still divides $p^s - 1$, so iterating the process does not bring any
improvement: the $t$-th root extraction in $\mathbb{F}_{p^s}$ must be dealt with by another algorithm. The
smaller $s$ is, the better.

A useful special case is $t = 2$, that is, we are taking square roots; the assumption that
$t$ divides $q - 1$ is then satisfied for all odd primes $p$ and all $n$. In this case, we have $s = 1$,
so the second step amounts to square root extraction in $\mathbb{F}_p$. Since this can be done in
$O(\log(p))$ expected operations in $\mathbb{F}_p$, the total running time of the algorithm is an expected
$O(M(n)\log(p) + C(n)\log(n))$ operations in $\mathbb{F}_p$.

A previous algorithm by Kaltofen and Shoup [12] allows one to compute $t$-th roots in
$\mathbb{F}_{p^n}$ in expected time $O((M(t)M(n)\log(p) + tC(n) + C(t)M(n))\log(n))$; we discuss it further
in the next section. This algorithm requires no assumption on $t$, so it can be used in our
algorithm in the case $s > 1$, for $t$-th root extraction in $\mathbb{F}_{p^s}$. Then, its expected running time
is $O((M(t)M(s)\log(p) + tC(s) + C(t)M(s))\log(s))$.

The strategy of using Theorem 1 to reduce from $\mathbb{F}_q$ to $\mathbb{F}_{p^s}$, then using the Kaltofen-Shoup
algorithm over $\mathbb{F}_{p^s}$, is never more expensive than using the Kaltofen-Shoup algorithm directly
over $\mathbb{F}_q$. For $t = O(1)$, both strategies are within a constant factor; but even for the smallest
case $t = 2$, our algorithm has advantages (as explained in the last section). For larger $t$,
the gap in our favor will increase for cases when $s$ is small (such as when $t$ divides $p - 1$,
corresponding to $s = 1$).

Finally, let us go back to the remark above, that for any $m$, one can reduce $m$-th root
extraction of $a \in \mathbb{F}_q^*$ to computing $t$-th roots, with $t$ dividing $q - 1$; this is well known, see
for instance [1, Chapter 7.3]. We write $m = uv$ with $(v, q - 1) = 1$ and $t \mid q - 1$ for every
prime divisor $t$ of $u$, and we assume that $a$ is indeed an $m$-th power.

• We first compute the $v$-th root $a_0$ of $a$ as $a_0 = a^{v^{-1}}$, by computing the inverse
$\ell$ of $v$ mod $q - 1$, and computing an $\ell$-th power in $\mathbb{F}_q$. This takes $O(nM(n)\log(p))$
operations in $\mathbb{F}_p$.

• Let $u = \prod_{i=1}^{d} m_i^{\alpha_i}$ be the prime factorization of $u$, which we assume is given to us.
Then, for $k = 1, \dots, \alpha_1$, we compute an $m_1$-th root $a_k$ of $a_{k-1}$ using Theorem 1, so
that $a_{\alpha_1}$ is an $m_1^{\alpha_1}$-th root of $a_0$.

One should be careful in the choice of the $m_1$-th roots (which are not unique), so as to
ensure that each $a_k$ is indeed a $u/m_1^k$-th power: if the given $a_k$ is not such a power,
we can multiply it by an $m_1$-th root of unity until we find a suitable one. The root of
unity can be found by the algorithm of Theorem 1.

Once we know $a_{\alpha_1}$, the same process can be applied to compute an $m_2^{\alpha_2}$-th root of $a_{\alpha_1}$,
and so on.

The first step, taking a root of order $v$, may actually be the bottleneck of this scheme.
When $v$ is small compared to $n$, it may be better to use here as well the algorithm by
Kaltofen and Shoup mentioned above.
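The first step of this reduction can be illustrated on a prime field. The following is a minimal Python sketch, assuming $n = 1$ (so $q = p$); the function name `coprime_root` is ours and does not come from the paper.

```python
from math import gcd


def coprime_root(a: int, v: int, p: int) -> int:
    """v-th root of a in F_p when gcd(v, p - 1) = 1 (illustrative, n = 1 only)."""
    if gcd(v, p - 1) != 1:
        raise ValueError("v must be coprime to p - 1")
    # ell is the inverse of v modulo the order p - 1 of the multiplicative group
    ell = pow(v, -1, p - 1)  # needs Python 3.8 or later
    # a^ell is the unique v-th root of a
    return pow(a, ell, p)


if __name__ == "__main__":
    r = coprime_root(5, 3, 11)
    print(r, pow(r, 3, 11))
```

Because $x \mapsto x^v$ is a bijection of the multiplicative group in this situation, the root is unique and no randomness is needed.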

Organization of the paper. The next section reviews and discusses known algorithms;
Section 3 gives the details of the root extraction algorithm and some experimental results.
Throughout the paper, $(\mathbb{F}_q^*)^t$ denotes the set of $t$-th powers in $\mathbb{F}_q^*$.

Acknowledgments. We thank NSERC and the Canada Research Chairs program for 
financial support. 

2 Previous work 

Let $t$ be a prime factor of $q - 1$. In the rest of this section, we discuss previous algorithms
for $t$-th root extraction, with a special focus on the case $t = 2$ (square roots), which has
attracted most attention in the literature.

We shall see in Section 3 that given such a prime $t$, the cost of testing for $t$-th power is
always dominated by the $t$-th root extraction; thus, for an input $a \in \mathbb{F}_q^*$, we always assume
that $a \in (\mathbb{F}_q^*)^t$.

All algorithms discussed below rely on some form of exponentiation in $\mathbb{F}_q$, or in an
extension of $\mathbb{F}_q$, with exponents that grow linearly with $q$. As a result, a direct implementation
using binary powering uses $O(\log(q))$ multiplications in $\mathbb{F}_q$, that is, $O(nM(n)\log(p))$
operations in $\mathbb{F}_p$. Even using fast multiplication, this is quadratic in $n$; alternative techniques
should be used to perform the exponentiation, when possible.

Some special cases of square root computation. If $G$ is a group with an odd order
$s$, then the mapping $f : G \to G$, $f(a) = a^2$ is an automorphism of $G$; hence, every element
$a \in G$ has a unique square root, which is $a^{(s+1)/2}$. Thus, if $q \equiv 3 \pmod 4$, the square root
of any $a \in (\mathbb{F}_q^*)^2$ is $a^{(q+1)/4}$; this is because $(\mathbb{F}_q^*)^2$ is a group of odd order $(q - 1)/2$.
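As an illustration of this folklore case, here is a minimal Python sketch over a prime field, assuming $n = 1$ and $p \equiv 3 \pmod 4$; the function name is ours.

```python
def sqrt_3mod4(a: int, p: int) -> int:
    """Square root of a quadratic residue a modulo a prime p with p % 4 == 3."""
    if p % 4 != 3:
        raise ValueError("this shortcut needs p = 3 (mod 4)")
    # The squares form a group of odd order (p - 1)/2, so a^((p + 1)/4) is a root.
    r = pow(a, (p + 1) // 4, p)
    if r * r % p != a % p:
        raise ValueError("a is not a square modulo p")
    return r


if __name__ == "__main__":
    r = sqrt_3mod4(13, 23)
    print(r, r * r % 23)
```

The final check doubles as a quadratic residue test: for a non-square input, the candidate squares to $-a$ rather than $a$.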

More complex schemes allow one to compute square roots for some increasingly restricted
classes of prime powers $q$. The following table summarizes results known to us; in each case,
the algorithm uses $O(1)$ exponentiations and $O(1)$ additions / multiplications in $\mathbb{F}_q$. The
table indicates what exponents are used in the exponentiations.



Table 1: Some special cases for square roots

Algorithm          $q$                       exponent
folklore           $q \equiv 3 \pmod 4$      $(q + 1)/4$
Atkin              $q \equiv 5 \pmod 8$      $(q - 5)/8$
Müller [14]        $q \equiv 9 \pmod{16}$    $(q - 1)/4$ and $(q - 9)/16$
Kong et al. [13]   $q \equiv 9 \pmod{16}$    $(q - 9)/8$ and $(q - 9)/16$

As was said above, using a direct binary powering approach to exponentiation, all these
algorithms use $O(nM(n)\log(p))$ operations in $\mathbb{F}_p$.

Cipolla's square root algorithm. To compute the square root of $a \in (\mathbb{F}_q^*)^2$, Cipolla's
algorithm uses an element $b$ in $\mathbb{F}_q$ such that $b^2 - 4a$ is not a square in $\mathbb{F}_q$. Then, the polynomial
$f(Y) = Y^2 - bY + a$ is irreducible over $\mathbb{F}_q$, hence $K = \mathbb{F}_q[Y]/(f)$ is a field. Let $y$ be the
residue class of $Y$ modulo $(f)$. Since $f$ is the minimal polynomial of $y$ over $\mathbb{F}_q$, $N_{K/\mathbb{F}_q}(y) = a$,
ensuring that $\sqrt{a} = Y^{(q+1)/2} \bmod (Y^2 - bY + a)$.

Finding a quadratic nonresidue of the form $b^2 - 4a$ by choosing a random $b \in \mathbb{F}_q$ requires
an expected $O(1)$ attempts [1, page 158]. The quadratic residue test and the norm computation
take $O(M(n)\log(n) + \log(p))$ and $O(nM(n)\log(p))$ multiplications in $\mathbb{F}_p$ respectively.
Therefore, the cost of the algorithm is an expected $O(nM(n)\log(p))$ operations in $\mathbb{F}_p$.
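The construction can be sketched in Python for the prime field case. This is only an illustration, assuming $n = 1$ and $p$ odd; elements of $K = \mathbb{F}_p[Y]/(Y^2 - bY + a)$ are stored as pairs $(u, v)$ standing for $u + vY$, and all names are ours.

```python
import random


def cipolla_sqrt(a: int, p: int, rng=random) -> int:
    """Square root of a quadratic residue a modulo an odd prime p (Cipolla)."""
    a %= p
    if a == 0:
        return 0
    if pow(a, (p - 1) // 2, p) != 1:
        raise ValueError("a is not a square modulo p")
    # Pick b such that b^2 - 4a is a non-square; expected O(1) attempts.
    while True:
        b = rng.randrange(p)
        if pow((b * b - 4 * a) % p, (p - 1) // 2, p) == p - 1:
            break

    def mul(x, y):
        # (u1 + v1 Y)(u2 + v2 Y) reduced with Y^2 = bY - a
        (u1, v1), (u2, v2) = x, y
        vv = v1 * v2
        return ((u1 * u2 - a * vv) % p, (u1 * v2 + u2 * v1 + b * vv) % p)

    # Binary powering: Y^((p + 1)/2) in K
    result, base, e = (1, 0), (0, 1), (p + 1) // 2
    while e:
        if e & 1:
            result = mul(result, base)
        base = mul(base, base)
        e >>= 1
    # The norm of y is a, so y^((p + 1)/2) squares to a and lies in F_p.
    return result[0]


if __name__ == "__main__":
    r = cipolla_sqrt(10, 13)
    print(r, r * r % 13)
```

The exponentiation loop is the part whose cost grows to $O(n)$ multiplications in $\mathbb{F}_q$ per bit of $p$ when the same idea is applied over an extension field.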

Algorithms extending Cipolla's to the computation of t-th roots in Fp, where t is a prime 
factor of p - 1, are in [21, 123 [15]. 

The Tonelli-Shanks algorithm. We will describe the algorithm in the case of square
roots, although the ideas extend to higher orders. Tonelli's algorithm [21] and Shanks'
improvement [18] use discrete logarithms to reduce the problem to a subgroup of $\mathbb{F}_q^*$ of odd
order. Let $q - 1 = 2^r \ell$ with $(\ell, 2) = 1$ and let $H$ be the unique subgroup of $\mathbb{F}_q^*$ of order
$\ell$. Assume we find a quadratic nonresidue $g \in \mathbb{F}_q^*$; then, the square root of $a \in \mathbb{F}_q^*$ can be
computed as follows: we can express $a$ as $g^s h \in g^s H$ by solving a discrete logarithm in $\mathbb{F}_q^*/H$;
$s$ is necessarily even, so that $\sqrt{a} = g^{s/2} h^{(\ell+1)/2}$.

According to [16], the discrete logarithm requires $O(r^2 M(n))$ operations in $\mathbb{F}_p$; all
other steps take $O(nM(n)\log(p))$ operations in $\mathbb{F}_p$. Hence, the expected running time of the
algorithm is $O((r^2 + n\log(p))M(n))$ operations in $\mathbb{F}_p$. Thus, the efficiency of this algorithm
depends on the structure of $\mathbb{F}_q^*$: there exists an infinite sequence of primes for which the cost
is $O(n^2 M(n)\log(p)^2)$, see [22].
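For concreteness, the following Python sketch gives the usual iterative formulation of Tonelli-Shanks over a prime field. It assumes $n = 1$ and $p$ odd, and it looks for the nonresidue $g$ by trying $2, 3, \dots$ in turn rather than at random; both choices are ours.

```python
def tonelli_shanks(a: int, p: int) -> int:
    """Square root of a quadratic residue a modulo an odd prime p."""
    a %= p
    if a == 0:
        return 0
    if pow(a, (p - 1) // 2, p) != 1:
        raise ValueError("a is not a square modulo p")
    # Write p - 1 = 2^r * ell with ell odd.
    r, ell = 0, p - 1
    while ell % 2 == 0:
        r += 1
        ell //= 2
    # Smallest quadratic nonresidue g (deterministic search, for simplicity).
    g = 2
    while pow(g, (p - 1) // 2, p) != p - 1:
        g += 1
    c = pow(g, ell, p)             # generator of the 2-Sylow subgroup
    x = pow(a, (ell + 1) // 2, p)  # candidate root, correct up to a 2-power part
    t = pow(a, ell, p)             # invariant: x^2 = a * t, t in the 2-Sylow subgroup
    m = r
    while t != 1:
        # Find the least i with t^(2^i) = 1; this is the discrete logarithm step.
        i, t2 = 0, t
        while t2 != 1:
            t2 = t2 * t2 % p
            i += 1
        b = pow(c, 1 << (m - i - 1), p)
        x = x * b % p
        c = b * b % p
        t = t * c % p
        m = i
    return x


if __name__ == "__main__":
    x = tonelli_shanks(13, 17)
    print(x, x * x % 17)
```

The nested loop performs at most about $r^2/2$ squarings, which is where the $r^2$ term in the cost estimate comes from.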

Improving the exponentiation. All algorithms seen before use at best $O(nM(n)\log(p))$
operations in $\mathbb{F}_p$, because of the exponentiation. Using ideas going back to [11], Barreto et
al. [3] observed that for some of the cases seen above, the exponentiation can be improved,
giving a cost subquadratic in $n$.

For instance, when taking square roots with $q \equiv 3 \pmod 4$, the exponentiation $a^{(q+1)/4}$
can be reduced to computing $a^{1+u+\cdots+u^{(n-3)/2}}$ with $u = p^2$, plus two (cheap) exponentiations
with exponents $p(p - 1)$ and $(p + 1)/4$. The special form of the exponent $1 + u + \cdots + u^{(n-3)/2}$
makes it possible to apply a binary powering approach, involving $O(\log(n))$ multiplications
and exponentiations, with exponents that are powers of $p$.
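The decomposition of the exponent can be checked numerically. The short Python sketch below verifies the identity $(q+1)/4 = \frac{p+1}{4}\bigl(1 + p(p-1)(1 + u + \cdots + u^{(n-3)/2})\bigr)$ with $u = p^2$, which is our reading of how the three exponents combine; the function name is ours.

```python
def split_exponent(p: int, n: int):
    """Return ((q + 1)/4, rebuilt exponent) for q = p**n, p = 3 (mod 4), n odd."""
    if p % 4 != 3 or n % 2 != 1:
        raise ValueError("need p = 3 (mod 4) and n odd, so that q = 3 (mod 4)")
    u = p * p
    # 1 + u + ... + u^((n - 3)/2); the sum is empty (zero) when n = 1
    geometric = sum(u ** k for k in range((n - 3) // 2 + 1))
    rebuilt = ((p + 1) // 4) * (1 + p * (p - 1) * geometric)
    return (p ** n + 1) // 4, rebuilt


if __name__ == "__main__":
    for p, n in [(3, 3), (7, 5), (11, 7)]:
        direct, rebuilt = split_exponent(p, n)
        print(p, n, direct == rebuilt)
```

Only the geometric part has length growing with $n$, and it involves powers of $p$ alone, which is what makes the Frobenius-based binary powering applicable.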

Further examples for square roots are discussed in [13, 14], covering the entries of Table 1.
These references assume a normal basis representation; using (as we do) the monomial basis
and modular composition techniques (which will be explained in the next section), the costs
become $O(M(n)\log(p) + C(n)\log(n))$. Some cases of higher index roots are in [2]: if $t$ is
a factor of $p - 1$, such that $t^2$ does not divide $p - 1$, and if $\gcd(n, t) = 1$, then $t$-th root
extraction can be done using $O(tM(n)\log(p) + C(n)\log(n))$ operations in $\mathbb{F}_p$.

Kaltofen and Shoup's algorithm. Finally, we mention what is, as far as we know, the 
only algorithm achieving an expected subquadratic running time in n (using the monomial 
basis representation), without any assumption on p. 

Consider a factor $t$ of $q - 1$. To compute a $t$-th root of $a \in (\mathbb{F}_q^*)^t$, the idea is simply
to factor $Y^t - a \in \mathbb{F}_q[Y]$ using polynomial factorization techniques. Since we know that $a$
is a $t$-th power, this polynomial splits into linear factors, so we can use an Equal Degree
Factorization (EDF) algorithm.

A specialized EDF algorithm, dedicated to the case of high-degree extension of a given
base field, was proposed by Kaltofen and Shoup [12]. It mainly reduces to the computation
of a trace-like quantity $b + b^p + \cdots + b^{p^{n-1}}$, where $b$ is a random element in $\mathbb{F}_q[Y]/(Y^t - a)$.
Using a binary powering technique similar to the one of the previous paragraph, this results
in an expected running time of $O((M(t)M(n)\log(p) + tC(n) + C(t)M(n))\log(n))$ operations
in $\mathbb{F}_p$; remark that this estimate is faster than what is stated in [12] by a factor $\log(t)$, since
here we only need one root, instead of the whole factorization. In the particular case $t = 2$,
this becomes $O((M(n)\log(p) + C(n))\log(n))$. This achieves a running time subquadratic
in $n$.
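To make the factorization idea concrete, here is a minimal Python sketch of random splitting for $Y^2 - a$ over a prime field. It assumes $t = 2$ and $n = 1$, in which case the trace-like quantity is just the random element itself, and it does not reproduce the Kaltofen-Shoup algorithm; names are ours.

```python
import random


def sqrt_by_splitting(a: int, p: int, rng=random) -> int:
    """Find a root of Y^2 - a over F_p (p odd prime, a a nonzero square)."""
    a %= p
    if pow(a, (p - 1) // 2, p) != 1:
        raise ValueError("a must be a nonzero square modulo p")

    def mul(x, y):
        # (u1 + v1 Y)(u2 + v2 Y) in F_p[Y]/(Y^2 - a), using Y^2 = a
        (u1, v1), (u2, v2) = x, y
        return ((u1 * u2 + a * v1 * v2) % p, (u1 * v2 + u2 * v1) % p)

    while True:
        r = rng.randrange(p)
        # h = (Y + r)^((p - 1)/2) - 1 modulo Y^2 - a
        result, base, e = (1, 0), (r, 1), (p - 1) // 2
        while e:
            if e & 1:
                result = mul(result, base)
            base = mul(base, base)
            e >>= 1
        u, v = (result[0] - 1) % p, result[1]
        if v != 0:
            # If gcd(h, Y^2 - a) is linear, its root is -u/v; verify before returning.
            root = (-u) * pow(v, -1, p) % p
            if root * root % p == a:
                return root
        # Otherwise the random choice did not separate the two roots; try again.


if __name__ == "__main__":
    r = sqrt_by_splitting(5, 101)
    print(r, r * r % 101)
```

Each attempt separates the two roots with probability close to $1/2$, so the whole powering step is repeated about twice on average.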

This idea actually allows one to compute a $t$-th root, for arbitrary $t$: starting from the
polynomial $Y^t - a$, we apply the above algorithm to $\gcd(Y^t - a, Y^q - Y)$; computing $Y^q$
modulo $Y^t - a$ can be done by the same binary powering techniques.

3 A new root extraction algorithm 

In this section, we focus on $t$-th root extraction in $\mathbb{F}_q$, for $t$ a prime dividing $q - 1$ (as we
saw in Section 1, $m$-th root extraction, for an integer $m \ge 2$, reduces to taking $t$-th roots,
where $t$ is a prime factor of $m$ dividing $q - 1$).

The algorithm we present uses the trace $\mathbb{F}_q \to \mathbb{F}_{q'}$, for some subfield $\mathbb{F}_{q'} \subset \mathbb{F}_q$, to reduce
$t$-th root extraction in $\mathbb{F}_q$ to $t$-th root extraction in $\mathbb{F}_{q'}$. We assume as before that the field $\mathbb{F}_q$
is represented by a quotient $\mathbb{F}_p[X]/(f)$, with $f(X) \in \mathbb{F}_p[X]$ a monic irreducible polynomial
of degree $n$. We let $x$ be the residue class of $X$ modulo $(f)$.

Since we will handle both $\mathbb{F}_q$ and $\mathbb{F}_{q'}$, conversions may be needed. We recall that the
minimal polynomial $g \in \mathbb{F}_p[Z]$ of an element $b \in \mathbb{F}_q$ can be computed in $O(C(n) + M(n)\log(n))$
operations in $\mathbb{F}_p$ [19]. Then, $\mathbb{F}_{q'} = \mathbb{F}_p[Z]/(g)$ is a subfield of $\mathbb{F}_q = \mathbb{F}_p[X]/(f)$; given $r \in \mathbb{F}_{q'}$,
written as a polynomial in $Z$, we obtain its representation on the monomial basis of $\mathbb{F}_q$ by
means of a modular composition, in time $C(n)$. We will write this operation Embed$(r, \mathbb{F}_q)$.
Note that when $b$ is in $\mathbb{F}_p$, all these operations are actually free.

3.1 An auxiliary algorithm 

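We first discuss a binary powering algorithm to solve the following problem. Starting from
$\lambda \in \mathbb{F}_q$, we are going to compute

$$\alpha_i(\lambda) = \lambda^{1+p^s} + \lambda^{1+p^s+p^{2s}} + \cdots + \lambda^{1+p^s+p^{2s}+\cdots+p^{is}},$$

for given integers $i, s > 0$. This question is similar to (but distinct from) some exponentiations
and trace-like computations we discussed before; our solution will be a similar binary
powering approach, which will perform $O(\log(i))$ multiplications and exponentiations by
powers of $p$. Let

$$\xi_i = x^{p^{is}}, \qquad \zeta_i(\lambda) = \lambda^{p^s+p^{2s}+\cdots+p^{is}} \qquad\text{and}\qquad \delta_i(\lambda) = \lambda^{p^s} + \lambda^{p^s+p^{2s}} + \cdots + \lambda^{p^s+p^{2s}+\cdots+p^{is}},$$

where all quantities are computed in $\mathbb{F}_q$, that is, modulo $f$; for simplicity, in this paragraph,
we will write $\alpha_i$, $\zeta_i$ and $\delta_i$. Note that $\alpha_i = \lambda\delta_i$, and that we have the following relations:

$$\xi_1 = x^{p^s}, \qquad \zeta_1 = \lambda^{p^s}, \qquad \delta_1 = \zeta_1,$$

and

$$\xi_i = \begin{cases} \xi_{i/2}^{\,p^{is/2}} & \text{if $i$ is even} \\ \xi_{i-1}^{\,p^s} & \text{if $i$ is odd,} \end{cases}
\qquad
\zeta_i = \begin{cases} \zeta_{i/2}\,\zeta_{i/2}^{\,p^{is/2}} & \text{if $i$ is even} \\ \zeta_1\,\zeta_{i-1}^{\,p^s} & \text{if $i$ is odd,} \end{cases}
\qquad
\delta_i = \begin{cases} \delta_{i/2} + \zeta_{i/2}\,\delta_{i/2}^{\,p^{is/2}} & \text{if $i$ is even} \\ \delta_{i-1} + \zeta_i & \text{if $i$ is odd.} \end{cases}$$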

Because we are working in a monomial basis, computing exponentiations to powers of $p$ is
not a trivial task; we will perform them using the following modular composition technique
from [7].

Take $j \ge 0$ and $r \in \mathbb{F}_q$, and let $R$ and $\Xi_j$ be the canonical preimages of respectively $r$
and $\xi_j$ in $\mathbb{F}_p[X]$; then, we have

$$r^{p^{js}} = R(\Xi_j) \bmod f,$$

see for instance [23, Chapter 14.7]. We will simply write this as $r(\xi_j)$, and note that it can be
computed using one modular composition, in time $C(n)$. These remarks give us the following
recursive algorithm, where we assume that $\xi_1 = x^{p^s}$ and $\zeta_1 = \lambda^{p^s}$ are already known.
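The identity $r^{p^{js}} = R(\Xi_j) \bmod f$ can be checked on a toy field. The Python sketch below uses schoolbook arithmetic and Horner's rule for the composition, so it illustrates the identity but not the $C(n)$ cost of fast modular composition; the helper names and the example field $\mathbb{F}_3[X]/(X^3 + 2X + 1)$ are our own choices.

```python
def mulmod(a, b, f, p):
    """Product of a and b in F_p[X]/(f); coefficient lists are low-to-high, f monic."""
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):  # eliminate X^k for k >= n
        c = prod[k]
        for j in range(n + 1):
            prod[k - n + j] = (prod[k - n + j] - c * f[j]) % p
    return prod[:n]


def power(a, e, f, p):
    """a^e in F_p[X]/(f) by binary powering."""
    result = [1] + [0] * (len(f) - 2)
    while e:
        if e & 1:
            result = mulmod(result, a, f, p)
        a = mulmod(a, a, f, p)
        e >>= 1
    return result


def compose(r, xi, f, p):
    """R(Xi) mod f by Horner's rule (a naive stand-in for modular composition)."""
    acc = [0] * (len(f) - 1)
    for coeff in reversed(r):
        acc = mulmod(acc, xi, f, p)
        acc[0] = (acc[0] + coeff) % p
    return acc


if __name__ == "__main__":
    p, f = 3, [1, 2, 0, 1]               # F_27 = F_3[X]/(X^3 + 2X + 1)
    x, r = [0, 1, 0], [1, 2, 1]
    xi = power(x, p, f, p)               # xi_1 = x^p, taking s = 1
    print(compose(r, xi, f, p) == power(r, p, f, p))
```

The identity holds because the coefficients of $R$ lie in $\mathbb{F}_p$ and are therefore fixed by the Frobenius map.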

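Algorithm 1 XiZetaDelta($\lambda$, $i$, $\xi_1$, $\zeta_1$)

Input: $\lambda$, a positive integer $i$, $\xi_1 = x^{p^s}$, $\zeta_1 = \lambda^{p^s}$
Output: $\xi_i$, $\zeta_i$, $\delta_i$

if $i = 1$ then
    return $\xi_1$, $\zeta_1$, $\zeta_1$
end if
$j \leftarrow \lfloor i/2 \rfloor$
$\xi_j, \zeta_j, \delta_j \leftarrow$ XiZetaDelta($\lambda$, $j$, $\xi_1$, $\zeta_1$)
$\xi_{2j} \leftarrow \xi_j(\xi_j)$
$\zeta_{2j} \leftarrow \zeta_j \cdot \zeta_j(\xi_j)$
$\delta_{2j} \leftarrow \delta_j + \zeta_j\,\delta_j(\xi_j)$
if $i$ is even then
    return $\xi_{2j}$, $\zeta_{2j}$, $\delta_{2j}$
end if
$\xi_i \leftarrow \xi_{2j}(\xi_1)$
$\zeta_i \leftarrow \zeta_1 \cdot \zeta_{2j}(\xi_1)$
$\delta_i \leftarrow \delta_{2j} + \zeta_i$
return $\xi_i$, $\zeta_i$, $\delta_i$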

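We deduce the following algorithm for computing $\alpha_i(\lambda)$.

Algorithm 2 Alpha($\lambda$, $i$)

Input: $\lambda$, a positive integer $i$
Output: $\alpha_i$

$\xi_1 \leftarrow x^{p^s}$
$\zeta_1 \leftarrow \lambda^{p^s}$
$\xi_i, \zeta_i, \delta_i \leftarrow$ XiZetaDelta($\lambda$, $i$, $\xi_1$, $\zeta_1$)
return $\lambda\delta_i$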

Proposition 2. Algorithm 2 computes $\alpha_i(\lambda)$ using $O(C(n)\log(is) + M(n)\log(p))$ operations
in $\mathbb{F}_p$.

Proof. To compute $x^{p^s}$ and $\lambda^{p^s}$ we first compute $x^p$ and $\lambda^p$ using $O(\log(p))$ multiplications
in $\mathbb{F}_q$, and then do $O(\log(s))$ modular compositions modulo $f$. The depth of the recursion
in Algorithm 1 is $O(\log(i))$; each recursive call involves $O(1)$ additions, multiplications and
modular compositions modulo $f$, for a total time of $O(C(n))$ per recursive call. □
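The two algorithms can be prototyped in a few lines of Python. The sketch below is only meant to exercise the recurrences: it uses schoolbook multiplication and Horner-rule composition instead of fast algorithms, computes $x^{p^s}$ and $\lambda^{p^s}$ by plain powering, and assumes $n \ge 2$; every function name is ours.

```python
def mulmod(a, b, f, p):
    """Product in F_p[X]/(f); coefficient lists are low-to-high, f monic."""
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        for j in range(n + 1):
            prod[k - n + j] = (prod[k - n + j] - c * f[j]) % p
    return prod[:n]


def power(a, e, f, p):
    result = [1] + [0] * (len(f) - 2)
    while e:
        if e & 1:
            result = mulmod(result, a, f, p)
        a = mulmod(a, a, f, p)
        e >>= 1
    return result


def compose(r, xi, f, p):
    """r(xi) in the notation of the text, by Horner's rule."""
    acc = [0] * (len(f) - 1)
    for coeff in reversed(r):
        acc = mulmod(acc, xi, f, p)
        acc[0] = (acc[0] + coeff) % p
    return acc


def add(a, b, p):
    return [(u + v) % p for u, v in zip(a, b)]


def xi_zeta_delta(i, xi1, zeta1, f, p):
    """Algorithm 1: returns (xi_i, zeta_i, delta_i)."""
    if i == 1:
        return xi1, zeta1, zeta1
    j = i // 2
    xi_j, zeta_j, delta_j = xi_zeta_delta(j, xi1, zeta1, f, p)
    xi = compose(xi_j, xi_j, f, p)
    zeta = mulmod(zeta_j, compose(zeta_j, xi_j, f, p), f, p)
    delta = add(delta_j, mulmod(zeta_j, compose(delta_j, xi_j, f, p), f, p), p)
    if i % 2 == 0:
        return xi, zeta, delta
    xi_i = compose(xi, xi1, f, p)
    zeta_i = mulmod(zeta1, compose(zeta, xi1, f, p), f, p)
    return xi_i, zeta_i, add(delta, zeta_i, p)


def alpha(lam, i, s, f, p):
    """Algorithm 2: alpha_i(lam) = lam * delta_i."""
    n = len(f) - 1
    x = [0, 1] + [0] * (n - 2)
    xi1, zeta1 = power(x, p ** s, f, p), power(lam, p ** s, f, p)
    return mulmod(lam, xi_zeta_delta(i, xi1, zeta1, f, p)[2], f, p)


def alpha_naive(lam, i, s, f, p):
    """Direct evaluation of the defining sum, used only as a cross-check."""
    total, exponent = [0] * (len(f) - 1), 1
    for k in range(1, i + 1):
        exponent += p ** (k * s)
        total = add(total, power(lam, exponent, f, p), p)
    return total


if __name__ == "__main__":
    p, f, lam = 3, [1, 2, 0, 1], [1, 2, 1]
    print(all(alpha(lam, i, 1, f, p) == alpha_naive(lam, i, 1, f, p) for i in range(1, 7)))
```

The recursion performs a bounded number of compositions per level, so replacing `compose` by a fast modular composition routine would give the $O(C(n)\log(i))$ behaviour stated in the proposition.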

As said before, the algorithm can also be implemented using a normal basis representation.
Then, exponentiations to powers of $p$ become trivial, but multiplication becomes more
difficult. We leave these considerations to the reader.

3.2 Taking t-th roots



We will now give our root extraction algorithm. As said before, we now let $t$ be a prime
factor of $q - 1$, and we let $s$ be the order of $p$ in $\mathbb{Z}/t\mathbb{Z}$. Then $s$ divides $n$, say $n = s\ell$.

We first explain how to test for $t$-th powers. Testing whether $a \in \mathbb{F}_q^*$ is a $t$-th power is
equivalent to testing whether $a^{(q-1)/t} = 1$. Let $\zeta = a^{(p^s-1)/t}$; then $a^{(q-1)/t} = \zeta^{1+p^s+\cdots+p^{(\ell-1)s}}$.
Computing $\zeta$ requires $O(sM(n)\log(p))$, and computing $\zeta^{1+p^s+\cdots+p^{(\ell-1)s}}$ using Algorithm 1
requires $O(C(n)\log(n) + M(n)\log(p))$ operations in $\mathbb{F}_p$. Therefore, testing for a $t$-th power
takes $O(C(n)\log(n) + sM(n)\log(p))$ operations in $\mathbb{F}_p$.
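The test rests on the factorization $(q-1)/t = \frac{p^s-1}{t}\,(1 + p^s + \cdots + p^{(\ell-1)s})$. The following Python sketch computes $s$ and the two factors for given $p$, $n$, $t$ so that the identity can be checked on small examples; the function name is ours.

```python
def power_test_exponents(p: int, n: int, t: int):
    """Return (s, (p^s - 1)/t, 1 + p^s + ... + p^((l-1)s)) for a prime t dividing p^n - 1."""
    if p % t == 0:
        raise ValueError("t must not divide p")
    # s is the multiplicative order of p modulo t
    s = 1
    while pow(p, s, t) != 1:
        s += 1
    if n % s != 0:
        raise ValueError("t does not divide p**n - 1")
    ell = n // s
    inner = (p ** s - 1) // t                         # exponent giving zeta
    outer = sum(p ** (k * s) for k in range(ell))     # exponent applied to zeta
    return s, inner, outer


if __name__ == "__main__":
    s, inner, outer = power_test_exponents(2, 6, 7)
    print(s, inner * outer == (2 ** 6 - 1) // 7)
```

Only `inner` requires a general exponentiation; `outer` is a sum of powers of $p^s$ and is handled by Frobenius-style binary powering.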

In the particular case when $t$ divides $p - 1$, we can actually do better: we have $a^{(q-1)/t} =
\mathrm{res}(f, a)^{(p-1)/t}$, where $\mathrm{res}(\cdot, \cdot)$ is the resultant function. The resultant can be computed using
$O(M(n)\log(n))$ operations in $\mathbb{F}_p$, so the whole test can be done using $O(M(n)\log(n) + \log(p))$
operations in $\mathbb{F}_p$.

In any case, we can now assume that we are given $a \in (\mathbb{F}_q^*)^t$, with $t$-th root $\gamma \in \mathbb{F}_q$.
Defining $\beta = T_{\mathbb{F}_q/\mathbb{F}_{q'}}(\gamma)$, where $T_{\mathbb{F}_q/\mathbb{F}_{q'}} : \mathbb{F}_q \to \mathbb{F}_{q'}$ is the trace linear form and $q' = p^s$, we
have

$$\beta = \sum_{i=0}^{\ell-1} \gamma^{p^{is}} = \gamma\left(1 + a^{(p^s-1)/t} + a^{(p^{2s}-1)/t} + \cdots + a^{(p^{(\ell-1)s}-1)/t}\right). \tag{1}$$

Let $b = 1 + a^{(p^s-1)/t} + a^{(p^{2s}-1)/t} + \cdots + a^{(p^{(\ell-1)s}-1)/t}$, so that Equation (1) gives $\beta = \gamma b$. Taking
the $t$-th power in both sides results in the equation $\beta^t = a b^t$ over $\mathbb{F}_{q'}$. Since we know $a$,
and we can compute $b$, we can thus determine $\beta$ by $t$-th root extraction in $\mathbb{F}_{q'}$. Then, if we
assume that $b \ne 0$ (or equivalently that $\beta \ne 0$), we deduce $\gamma = \beta b^{-1}$; to resolve the issue
that $\beta$ may be zero, we will replace $a$ by $a' = a c^t$, for a random element $c \in \mathbb{F}_q^*$.

Computing the $t$-th root of $a' b^t$ in $\mathbb{F}_{q'}$ is done as follows. We first compute the minimal
polynomial $g \in \mathbb{F}_p[Z]$ of $a' b^t$, and let $z$ be the residue class of $Z$ in $\mathbb{F}_p[Z]/(g)$. Then, we
compute a $t$-th root $r$ of $z$ in $\mathbb{F}_p[Z]/(g)$, and embed $r$ in $\mathbb{F}_q$. The computation of $r$ is done
by a black-box $t$-th root extraction algorithm, denoted by $r \mapsto r^{1/t}$.

It remains to explain how to compute $b$ efficiently. Let $\lambda = a^{(p^s-1)/t}$; then, one verifies
that $b = 1 + \lambda + \alpha_{\ell-2}(\lambda)$, so we can use the algorithm of the previous subsection. Putting
all together, we obtain the following algorithm.



Algorithm 3 $t$-th root in $\mathbb{F}_q^*$

Input: $a \in (\mathbb{F}_q^*)^t$
Output: a $t$-th root of $a$

$s \leftarrow$ the order of $p$ in $\mathbb{Z}/t\mathbb{Z}$
$\ell \leftarrow n/s$
repeat
    choose a random $c \in \mathbb{F}_q^*$
    $a' \leftarrow a c^t$
    $\lambda \leftarrow a'^{(p^s-1)/t}$
    $b \leftarrow 1 + \lambda +$ Alpha($\lambda$, $\ell - 2$)
until $b \ne 0$
$g \leftarrow$ MinimalPolynomial($a' b^t$)
$\beta \leftarrow z^{1/t}$ in $\mathbb{F}_p[Z]/(g)$
return Embed($\beta$, $\mathbb{F}_q$) $b^{-1} c^{-1}$
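To see the whole procedure at work, here is a small Python prototype for the square root case ($t = 2$, hence $s = 1$ and $\mathbb{F}_{q'} = \mathbb{F}_p$). It is a sketch under several simplifying assumptions of ours: $b$ is accumulated by a direct loop instead of Algorithm 2, the black-box square root in $\mathbb{F}_p$ is a brute-force search suitable only for tiny $p$, inversion uses $u^{q-2}$, and arithmetic is schoolbook.

```python
import random


def mulmod(a, b, f, p):
    """Product in F_p[X]/(f); coefficient lists are low-to-high, f monic."""
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        for j in range(n + 1):
            prod[k - n + j] = (prod[k - n + j] - c * f[j]) % p
    return prod[:n]


def power(a, e, f, p):
    result = [1] + [0] * (len(f) - 2)
    while e:
        if e & 1:
            result = mulmod(result, a, f, p)
        a = mulmod(a, a, f, p)
        e >>= 1
    return result


def sqrt_fq(a, f, p, rng=random):
    """Square root of a square a in F_p[X]/(f), p odd, via the trace to F_p."""
    n = len(f) - 1
    one = [1] + [0] * (n - 1)
    while True:
        c = [rng.randrange(p) for _ in range(n)]
        if not any(c):
            continue                                   # c must be nonzero
        a1 = mulmod(a, mulmod(c, c, f, p), f, p)       # a' = a c^2
        lam = power(a1, (p - 1) // 2, f, p)            # lambda = a'^((p - 1)/2)
        # b = 1 + lambda + lambda^(1+p) + ... + lambda^(1+p+...+p^(n-2))
        b, term = one[:], one[:]
        for _ in range(n - 1):
            term = mulmod(lam, power(term, p, f, p), f, p)
            b = [(u + v) % p for u, v in zip(b, term)]
        if any(b):
            break
    d = mulmod(a1, mulmod(b, b, f, p), f, p)           # beta^2 = a' b^2
    if any(d[1:]):
        raise ValueError("a is not a square in this field")
    # Stand-in for the black-box square root in F_p (brute force, tiny p only).
    beta = next(r for r in range(p) if r * r % p == d[0])
    inv = power(mulmod(b, c, f, p), p ** n - 2, f, p)  # (b c)^(-1)
    return [beta * u % p for u in inv]                 # gamma = beta b^(-1) c^(-1)


if __name__ == "__main__":
    p, f = 7, [1, 1, 0, 1]                             # F_343 = F_7[X]/(X^3 + X + 1)
    g = [3, 1, 4]
    a = mulmod(g, g, f, p)
    r = sqrt_fq(a, f, p)
    print(mulmod(r, r, f, p) == a)
```

Since $s = 1$ here, the minimal polynomial and embedding steps are trivial: $a' b^2$ is already a constant, which the sketch checks explicitly.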

The following proposition proves Theorem 1.

Proposition 3. Algorithm 3 computes a $t$-th root of $a$ using an expected $O(sM(n)\log(p) +
C(n)\log(n))$ operations in $\mathbb{F}_p$, plus a $t$-th root extraction in $\mathbb{F}_{q'}$.

Proof. Note first that $\beta = 0$ means that $T_{\mathbb{F}_q/\mathbb{F}_{q'}}(\gamma c) = 0$. There are $q/q'$ values of $c$ for which
this is the case, so we expect to have to choose $O(1)$ elements in $\mathbb{F}_q$ before exiting the repeat
... until loop. Each pass in the loop uses $O(sM(n)\log(p))$ operations in $\mathbb{F}_p$ to compute $a'$
and $\lambda$, and $O(C(n)\log(n) + M(n)\log(p))$ operations in $\mathbb{F}_p$ to compute $b$.

Given $a'$ and $b$, one obtains $b^t$ and $a' b^t$ using another $O(sM(n)\log(p))$ operations in $\mathbb{F}_p$;
then, computing $g$ takes time $O(C(n) + M(n)\log(n))$.

After the black-box call to $t$-th root extraction modulo $g$, embedding $\beta$ in $\mathbb{F}_q$ takes time
$C(n)$. We can then deduce $\gamma$ by two divisions in $\mathbb{F}_q$, using $O(M(n)\log(n))$ operations in $\mathbb{F}_p$;
this is negligible compared to the cost of all modular compositions. □

3.3 Experimental results 

We have implemented our root extraction algorithm, in the case $m = 2$ (that is, we are
taking square roots); our implementation is based on Shoup's NTL [20]. Figure 1 compares
our algorithm to Cipolla's and Tonelli-Shanks' algorithms over $\mathbb{F}_q$, with $q = p^n$, for the
randomly selected prime $p = 348975609381470925634534573457497$, and different values of
the extension degree $n$.

Figure 1: Our square root algorithm vs. Cipolla's and Tonelli-Shanks' algorithms.

Remember that the bottleneck in Cipolla's and Tonelli-Shanks' algorithms is the
exponentiation, which takes $O(nM(n)\log(p))$ operations in $\mathbb{F}_p$. As it turns out, NTL's
implementation of modular composition has $\omega = 3$; this means that with this implementation
we have $C(n) = O(n^2)$, and our algorithm takes expected time $O(M(n)\log(p) + n^2\log(n))$.
Although this implementation is not subquadratic in $n$, it remains faster than Cipolla's and
Tonelli-Shanks' algorithms, in theory and in practice.

Next, Figure 2 compares our NTL implementation of the EDF algorithm proposed by
Kaltofen and Shoup, and our square root algorithm (note that the range of reachable degrees
is much larger than in the first figure). We ran the algorithms for several random
elements for each extension degree. The vertical dashed lines and the green line show
the running time range and the average running time of Kaltofen and Shoup's algorithm,
respectively. In the case of our algorithm (the red graph), the vertical ranges are invisible 
because the deviation from the average is ~ 10~^ seconds. 
Figure 2: Our algorithm vs. Kaltofen and Shoup's algorithm.

This time, the results are closer. Nevertheless, it appears that the running time of our
algorithm behaves more "smoothly", in the sense that random choices seem to have less
influence. This is indeed the case. The random choice in Kaltofen and Shoup's algorithm
succeeds with probability about $1/2$; in case of failure, we have to run the whole algorithm
again. In our case, our choice of an element $c$ in $\mathbb{F}_q^*$ fails with probability $1/p \ll 1/2$; then,
there is still randomness involved in the $t$-th root extraction in $\mathbb{F}_p$, but this step was negligible
in the range of parameters where our experiments were conducted.

Another way to express this is to compare the standard deviations in the running times of
both algorithms. In the case of Kaltofen-Shoup's algorithm, the standard deviation is about
$1/\sqrt{2}$ of the average running time of the whole algorithm. For our algorithm, the standard
deviation is no more than $1/\sqrt{p}$ of the average running time of the trace-like computation
(which is the dominant part), plus $1/\sqrt{2}$ of the average running time of the root extraction
in $\mathbb{F}_p$ (which is cheap).
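These ratios are what one obtains from a simple model in which the running time is proportional to the number of independent attempts until the first success. Under that assumption (ours, for illustration), the number of attempts is geometric and its standard deviation divided by its mean equals $\sqrt{1 - \pi}$, where $\pi$ is the success probability of one attempt; the Python sketch below evaluates this ratio.

```python
from math import sqrt


def relative_std(success_probability: float) -> float:
    """std / mean of the number of trials of a geometric distribution."""
    # mean = 1/pi and std = sqrt(1 - pi)/pi, so the ratio is sqrt(1 - pi)
    return sqrt(1.0 - success_probability)


if __name__ == "__main__":
    p = 10007                                   # any odd prime, for illustration
    print(relative_std(0.5), 1 / sqrt(2))       # splitting step succeeds half the time
    print(relative_std(1 - 1 / p), 1 / sqrt(p)) # choice of c fails with probability 1/p
```

With a success probability of $1/2$ the ratio is $1/\sqrt{2}$, and with a failure probability of $1/p$ it is $1/\sqrt{p}$, matching the two figures quoted above.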